Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve hash distribution with short strings #109

Merged
merged 2 commits into from
Feb 15, 2024

Conversation

JaeseungYeom
Copy link
Contributor

@JaeseungYeom JaeseungYeom commented Feb 12, 2024

Strings shorter than 128 bytes collide possible due to the implementation details of MurmurHash3_x64_128(). Especially, the hash value seems to depend on the length of such a string.
For such a short string, we concatenate it as many times as needed to make it longer than 128 bytes.
Added a test to show hash results.

For example, before this fix, test_murmur3 3 1024 data1.txt data2.txt data3.txt 123456.txt 1data5.txt data91.txt showed the following results

data1.txt	199.ca.3e.data1.txt
data2.txt	199.ca.3e.data2.txt
data3.txt	199.ca.3e.data3.txt
123456.txt	37e.c8.89.123456.txt
1data5.txt	37e.c8.89.1data5.txt
data91.txt	37e.c8.89.data91.txt

After the fix, the below is the new results.

data1.txt	2cd.1cd.3cc.data1.txt
data2.txt	11f.2cb.5a.data2.txt
data3.txt	22b.28f.11d.data3.txt
123456.txt	db.272.14.123456.txt
1data5.txt	8a.1ce.f8.1data5.txt
data91.txt	219.32f.47.data91.txt

@JaeseungYeom JaeseungYeom self-assigned this Feb 12, 2024
@JaeseungYeom JaeseungYeom added the bug Something isn't working label Feb 12, 2024
Copy link
Collaborator

@hariharan-devarajan hariharan-devarajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes makes sense.

@JaeseungYeom JaeseungYeom merged commit 24406a0 into flux-framework:main Feb 15, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants